DEPEND: A Simulation-Based Environment for System Level Dependability Analysis
Abstract
The design and evaluation of highly reliable computer systems is a complex issue. Designers mostly develop such systems based on prior knowledge and experience, and occasionally from analytical evaluations of simplified designs. This paper presents a simulation-based environment called DEPEND, which is geared specifically toward the design and evaluation of fault-tolerant architectures. DEPEND is unique in that it exploits the properties of object-oriented programming to provide a flexible framework with which a user can rapidly model and evaluate various fault-tolerant systems. The paper describes the key features of the DEPEND environment and illustrates its capabilities with a detailed analysis of a real design. In particular, DEPEND is used to simulate the Unix-based Tandem Integrity fault-tolerant system (see footnote 1) and to evaluate how well it copes with near-coincident errors caused by correlated and latent faults. Issues that affect how the system handles near-coincident errors, such as memory scrubbing, re-integration policies, and workload-dependent repair times, are also evaluated. The method DEPEND uses to simulate error latency and the time acceleration technique that provides enormous simulation speedup are also discussed. Unlike other simulation-based dependability studies, the use of these approaches and the accuracy of the simulation model are validated by comparing the simulation results with measurements obtained from fault injection experiments conducted on a production Tandem Integrity machine.

Footnote 1: The MTBF figures presented in this paper should not be construed to reflect the MTBF figures of an actual Tandem Integrity system, because key parameters that have a direct bearing on this measure were not obtained from measurements of the Integrity system but rather from other production machines. For this reason, the results shown in this paper should only be construed to reflect the trend and behavior of a general TMR-based system.
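As a rough illustration of what a simulation-based estimate for a general TMR-based system looks like, the following minimal sketch estimates the mean time to failure of a three-module system in which a second module failure arriving before the first is repaired (a near-coincident failure) brings the system down. It is not DEPEND's model or API: the function names, the exponential failure and repair times, and the example rates are all assumptions made for illustration, and the closed-form value it is compared against is the standard result for this simple three-state Markov model.

```python
import random


def simulate_tmr_time_to_failure(rng, fail_rate, repair_rate):
    """One simulated run (hypothetical model, not DEPEND's).

    Returns the time until a second module fails while the first
    failed module is still awaiting repair.
    """
    t = 0.0
    while True:
        # All three modules working: wait for the first failure.
        t += rng.expovariate(3 * fail_rate)
        # Two modules working, one under repair: the next event is either
        # a second failure (system failure) or completion of the repair.
        total_rate = 2 * fail_rate + repair_rate
        t += rng.expovariate(total_rate)
        if rng.random() < (2 * fail_rate) / total_rate:
            return t  # near-coincident failure: the voter loses its majority
        # Otherwise the repaired module is re-integrated and the loop repeats.


def estimate_mttf(fail_rate, repair_rate, runs=20000, seed=1):
    """Monte Carlo estimate of mean time to failure over `runs` runs."""
    rng = random.Random(seed)
    total = sum(
        simulate_tmr_time_to_failure(rng, fail_rate, repair_rate)
        for _ in range(runs)
    )
    return total / runs


def analytic_mttf(fail_rate, repair_rate):
    """Closed-form MTTF of the same three-state Markov model."""
    return (5 * fail_rate + repair_rate) / (6 * fail_rate ** 2)


if __name__ == "__main__":
    # Illustrative rates only (assumed, not taken from the paper).
    lam, mu = 1.0, 10.0
    print("simulated MTTF:", estimate_mttf(lam, mu))
    print("analytic MTTF: ", analytic_mttf(lam, mu))
```

Comparing the simulated value against the closed form is a small-scale analogue of the validation step the abstract describes. Unlike this sketch, the abstract's study also covers correlated and latent faults, memory scrubbing, and workload-dependent repair times.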
Journal: IEEE Trans. Computers
Volume: 46
Publication year: 1997